There are no small coincidences and big coincidences! There are only 
coincidences! 

From "The Statue" episode of Seinfeld. 



1. Introduction. Andrey Feuerverger has undertaken a serious challenge.
The subject matter is controversial and finding a sensible way to formulate
the problem in a rigorous statistical manner is difficult.

The paper is notable for its thoroughness. We have rarely seen a paper 
on an applied problem that provides so much background material. Most 
importantly, the author is very careful to document all his assumptions and 
to remind the reader that the conclusion is sensitive to these assumptions. He
resists the temptation to present his results in a sensationalistic way. Rather,
he conveys his analysis in a dispassionate, understated tone. Nonetheless, he
could still end up on Oprah.

We are trying to assess the probability of a hypothesis when the hypothesis
is formed after seeing the data. This is a notoriously difficult problem. As
Feuerverger notes, coincidences are common. But just how common?

One response, the nihilistic approach, is to say that it is impossible and
stop there. We have much sympathy with the nihilists in a problem like
this. Perhaps the scientifically honorable path is to say that any answer
is misleading so it is better to provide no answer. But ultimately this is
unsatisfying and we accept the author's approach to provide an analysis
with many caveats.

H. HOFLING AND L. WASSERMAN

The question may be framed formally as follows. We observe an outcome
x (a tomb with interesting names) and we want to know: is this outcome
surprising? One way to quantify surprisingness is to perform the following
steps:

1. Construct a sample space X that contains x. 

2. Identify all the outcomes A that would have been considered surprising 
if they had been observed. 

3. Construct an appropriate null distribution $P_0$.

4. Compute the p-value $p = P_0(A)$.

The most difficult step is identifying the set A of interesting outcomes. It 
is explicitly counterfactual to ask if an outcome would have been surprising 
if it had occurred, knowing that it did not occur. 

2. Feuerverger's approach. What the author has proposed is both interesting
and reasonable. Numerous judgement calls have to be made but they
have been carefully documented. Our summary of Feuerverger's method is
this: The sample space is chosen to be sets of names on ossuaries, subject to
some restrictions. The null measure is essentially random sampling from an
onomasticon. The author defines a statistic (RR) that maps sets of names
into products of numbers. These numbers are essentially sample proportions,
modified to take into account various nuances such as surprisingness of
versions of names. The result is a very small p-value, suggesting that the find is
indeed surprising.

The "Mariamenou η Mara" inscription has a very big effect on Feuerverger's
RR statistic. An explanation for this is that the RR statistic becomes more
significant if broad name categories are subdivided into special name
renditions, even if the particular name renditions are not relevant. The
following example illustrates this point:

A population has three names A, B and C, each with frequency 1/3. A has
two name renditions, $A_1$ (1/3 of A) and $A_2$ (2/3 of A). Our family has two
members named A and B, and $A_1$ and $A_2$ are both relevant. The uncovered
tomb has one inscription, $A_1$. When only considering broad name categories,
we have RR(A) = 1/3, RR(B) = 1/3 and RR(C) = 0. When the null is random
drawing from the population, the p-value is then 2/3.

When taking name renditions into account, RR($A_1$) = 1/9, RR($A_2$) = 2/9,
RR(B) = 1/3 and RR(C) is unchanged, giving a p-value of 1/9. The p-value
decreased although both name renditions were considered relevant. The change
in p-value can be even more substantial in more complicated cases.
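
The arithmetic of this toy example can be checked with a short sketch. It assumes that the p-value is the null probability of drawing a relevant name whose RR value is no larger than the observed one, and that the irrelevant name C never counts as surprising (modelled here as `None`). The function and variable names are illustrative only.

```python
from fractions import Fraction

def rr_p_value(outcomes, observed):
    """Null probability of a relevant outcome with RR <= RR(observed).

    `outcomes` maps a name to (null probability, RR value); an RR of None
    marks a name that is not relevant to the family (an assumption of this
    sketch) and is therefore never counted as surprising.
    """
    threshold = outcomes[observed][1]
    return sum(prob for prob, rr in outcomes.values()
               if rr is not None and rr <= threshold)

third = Fraction(1, 3)

# Broad name categories only.
broad = {"A": (third, third), "B": (third, third), "C": (third, None)}

# A split into renditions A1 (1/3 of A) and A2 (2/3 of A).
fine = {
    "A1": (Fraction(1, 9), Fraction(1, 9)),
    "A2": (Fraction(2, 9), Fraction(2, 9)),
    "B": (third, third),
    "C": (third, None),
}

p_broad = rr_p_value(broad, "A")   # observed inscription read as "A"
p_fine = rr_p_value(fine, "A1")    # same inscription read as rendition "A1"
print(p_broad, p_fine)
```

Splitting A changes nothing about the population or the family, yet the tail event shrinks from {A, B} to {A1} alone, which is what drives the smaller p-value.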

In this comment, we present a Frequentist and a Bayesian approach that do 
not have this problem and yield quite different results. 

3. A different approach. We would like to consider a different way of
defining the basic event A. Our approach is more expansive and, as a result,
more conservative. Instead of asking "What is the probability of getting this
set of names?" we ask "What is the probability of getting some interesting
set of names if one looks at several tombs?"

Let X be all name sets. Examples of sample points in X are

x = {Salome},

x = {Levi, Hanan, Simon, Mariam},

x = {Joseph, Jesus, Sarah},

and so on. Define a list of target names S. The list should include all names 
that will spark interest. We take this to be either the big set 

S = {Mariam, Mary, Salome, James, Joseph, Joanna, Martha} 

or the small set 

S* = {Mariam, Mary, Salome, James, Joseph}. 

The name "Jesus" is not included because we will treat it separately. We
assume that a tomb would have triggered interest if its name set B has
sufficient overlap with S. We lump together different versions of names since
interested observers would surely argue that a tomb is interesting if there is
any way at all of matching the found names to potentially interesting names.
Denote the name sets in the tombs by $B_1, \ldots, B_N$. Say that $B_i$ is
interesting if

\[ |B_i \cap S| \ge 3 \quad \text{and} \quad \text{"Jesus"} \in B_i. \]

We denote the probability of this event by $\pi_i$. Assuming independence of
name assignments in and across tombs, the p-value is

\[ p = 1 - \prod_{i=1}^{N} \bigl(1 - q(n_i, \pi_i)\bigr), \]

where $n_i$ is the number of ossuaries in tomb $B_i$,

\[ q(n_i, \pi_i) = \pi_J \, P(Y_i \ge 3), \qquad Y_i \sim \mathrm{Binomial}(n_i - 1, \nu), \]

$\nu$ is the probability that a single name drawn at random is in S and $\pi_J$ is
the probability of drawing the name "Jesus." We do not take $\pi_J$ to be the
probability of drawing "Jesus son of Joseph" because the tomb could have
been considered interesting if it had only said "Jesus."

For our calculations we take $N = 100$ and $n_i = 6$, so that $\pi_i = \pi$ does
not vary with i. The number 100 comes from the fact that there are about
1000 tombs but only 10 percent have been excavated. We consider two
possibilities for the male-female ratio: (i) equal or (ii) unequal, as
represented by the onomasticon. For example, when S is the first (big) set
and the male/female ratio is equal, we get

\[ \nu = \frac{1}{2}\left(\frac{231 + 103 + 45}{2509}\right) + \frac{1}{2}\left(\frac{81 + 63 + 21 + 12}{317}\right) \approx 0.355. \]

The values of $\pi$ and p for the different combinations of assumptions are as
follows:

S       m/f ratio   $\pi$    p-value
big     equal       0.005    0.393
big     not equal   0.002    0.183
small   equal       0.003    0.290
small   not equal   0.002    0.158
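
The four rows above can be recomputed with a minimal sketch of the p-value formula. It rests on assumptions that are not spelled out in the text: the count 103 is taken as the frequency of "Jesus" among the 2509 male names, the female counts 21 and 12 are taken to belong to the two names dropped from the small set, and the unequal ratio is taken to be the pooled proportions of the 2509 male and 317 female entries. Under these assumptions the sketch gives values that round to the tabulated ones.

```python
from math import comb

MALE_TOTAL, FEMALE_TOTAL = 2509, 317
MALE_TARGET = 231 + 103 + 45            # male counts as they appear in the formula for nu
FEMALE_TARGET_BIG = 81 + 63 + 21 + 12   # female counts as they appear in the formula for nu
FEMALE_TARGET_SMALL = 81 + 63           # assumption: 21 and 12 are the names dropped from S*
JESUS = 103                             # assumption: count used for the name "Jesus"

def binom_tail(n, prob, k):
    """P(Y >= k) for Y ~ Binomial(n, prob)."""
    return sum(comb(n, j) * prob**j * (1 - prob)**(n - j) for j in range(k, n + 1))

def tomb_p_value(female_target, equal_ratio, n_tombs=100, n_ossuaries=6):
    """Return (pi, p): per-tomb probability and probability over all tombs."""
    if equal_ratio:
        nu = 0.5 * MALE_TARGET / MALE_TOTAL + 0.5 * female_target / FEMALE_TOTAL
        pi_jesus = 0.5 * JESUS / MALE_TOTAL
    else:
        total = MALE_TOTAL + FEMALE_TOTAL
        nu = (MALE_TARGET + female_target) / total
        pi_jesus = JESUS / total
    pi = pi_jesus * binom_tail(n_ossuaries - 1, nu, 3)
    return pi, 1 - (1 - pi) ** n_tombs

for label, female_target in (("big", FEMALE_TARGET_BIG), ("small", FEMALE_TARGET_SMALL)):
    for equal in (True, False):
        pi, p = tomb_p_value(female_target, equal)
        print(f"{label:5s} {'equal' if equal else 'not equal':9s} {pi:.3f} {p:.3f}")
```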

We reiterate that we have not treated name variations as special. But 
the calculation is invariant under splitting names into subcategories since 
we are finding the probability of a set of interesting names, not a particular 
name. We also ignored family structure. We now consider two variations. 
We consider replacing "Jesus" with "Jesus son of Joseph" by multiplying 
these two probabilities. We also consider taking N = 1000 to reflect the 
unobserved tombs. The results are: 





                      N = 100   N = 1000
Jesus                 0.16      0.82
Jesus son of Joseph   0.01      0.13
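
A rough check of these four numbers is sketched below. The text does not say which earlier setting the variations start from, so the sketch assumes the small target set with the onomasticon male/female ratio, and it assumes that 231 is the count for "Joseph"; both are our guesses, chosen because they give values that round to the figures in the table.

```python
from math import comb

TOTAL = 2509 + 317                        # male plus female entries (pooled, assumed)
NU = (231 + 103 + 45 + 81 + 63) / TOTAL   # small target set, onomasticon ratio (assumed)
PI_JESUS = 103 / TOTAL                    # assumed frequency of "Jesus"
PI_JOSEPH = 231 / TOTAL                   # assumed frequency of "Joseph"

def some_tomb_interesting(pi_key, n_tombs, n_ossuaries=6):
    """1 - prod(1 - q) with q = pi_key * P(Y >= 3), Y ~ Binomial(n_ossuaries - 1, NU)."""
    m = n_ossuaries - 1
    tail = sum(comb(m, j) * NU**j * (1 - NU)**(m - j) for j in range(3, m + 1))
    return 1 - (1 - pi_key * tail) ** n_tombs

for label, pi_key in (("Jesus", PI_JESUS), ("Jesus son of Joseph", PI_JESUS * PI_JOSEPH)):
    row = [some_tomb_interesting(pi_key, n) for n in (100, 1000)]
    print(f"{label:20s} {row[0]:.2f} {row[1]:.2f}")
```

The jump from 0.16 to 0.82 comes entirely from the exponent N, which illustrates how sensitive the result is to the number of tombs one imagines having searched.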



There is one case where the p-value is small. But the lack of robustness 
of this result does not make us confident in reporting a small p-value. 

We conclude that the observed event is not rare at all. The chance that 
an observer would find a tomb that could be said to contain interesting 
target names is large. This is due to the fact that the interesting names are 
common and that the many tombs provide many opportunities for apparent 
surprises. 

4. Bayesian analysis. Now we consider a Bayesian analysis of the problem.
We need to compute

\[ P(\theta = 1 \mid x) = \frac{p(x \mid \theta = 1) \, P(\theta = 1)}{p(x \mid \theta = 1) \, P(\theta = 1) + p(x \mid \theta = 0) \, P(\theta = 0)}, \]

where x denotes the data, $\theta = 1$ means that the tomb is from the NT family and
$\theta = 0$ means that the tomb is from the normal population.

In the frequentist approach, a partial ordering has to be defined on the
space of all outcomes. Feuerverger does this using the RR statistic and the
approach described above uses intersection of name sets. However, discerning
the exact ordering on the space of outcomes may be hard or people might not
agree with it. The advantage of the Bayesian approach is that the alternative
distribution only has to be defined at the point x and no ordering on the
space of possible outcomes is needed.

4.1. Posterior probability. Let us introduce a little more notation at
this point. Let c be the configuration of a tomb, g be its genealogy,
$n = (n_1, \ldots, n_K)$ the broad name categories and $r = (r_1, \ldots, r_K)$ the
particular name renditions. Assuming that every name rendition only depends
on $\theta$ and its broad name category, we can write

\[ P(x \mid \theta) = P(c, g \mid \theta) \, P(n \mid g, c, \theta) \prod_{i=1}^{K} P(r_i \mid n_i, \theta). \]

Simplifying assumptions: To make the computations easier, we make two 
more assumptions: 

1. The configuration and genealogy we expect to see in the NT family tomb
are no different from the rest of the population, that is,
$P(c, g \mid \theta = \mathrm{NT}) = P(c, g \mid \theta = \mathrm{P})$.

2. The particular name renditions we expect to see in the NT family tomb
are no different than what we expect to see in the rest of the population,
that is, $P(r_i \mid n_i, \theta = \mathrm{NT}) = P(r_i \mid n_i, \theta = \mathrm{P})$. This assumption will be
relaxed later.

Then the posterior odds are

\[ \frac{P(\theta = 1 \mid x)}{P(\theta = 0 \mid x)} = \frac{P(\theta = 1)}{P(\theta = 0)} \cdot \frac{P(n \mid c, g, \theta = 1)}{P(n \mid c, g, \theta = 0)}. \]

4.2. Distributions. First, we define the prior distribution. Feuerverger
estimates the number of tombs in the area to be about $N = 1100$. Also, let
the prior probability of the NT family having a tomb at all be t. Then

\[ P(\theta = 1) = \frac{t}{N}, \qquad P(\theta = 0) = 1 - P(\theta = 1). \]

In order to be optimistic, we take $t = 1$ and get prior odds of

\[ \frac{P(\theta = 1)}{P(\theta = 0)} = \frac{1}{1099}. \]

This prior can be thought of as a Bayesian approach to account for data
snooping, that is, the possibility of searching through many tombs.
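
As a small illustration of how this prior enters the posterior odds, the sketch below computes the prior odds for given t and N and converts posterior odds back into a probability. The function names are ours, and the likelihood ratio passed in the example is a hypothetical input, not a value from the analysis.

```python
from fractions import Fraction

def prior_odds(t, n_tombs):
    """Prior odds P(theta = 1) / P(theta = 0) with P(theta = 1) = t / N."""
    p_nt = Fraction(t) / n_tombs
    return p_nt / (1 - p_nt)

def posterior_probability(odds_prior, likelihood_ratio):
    """Posterior probability of theta = 1 from prior odds times a likelihood ratio."""
    odds_post = odds_prior * likelihood_ratio
    return odds_post / (1 + odds_post)

optimistic_prior = prior_odds(1, 1100)                        # t = 1, N = 1100
break_even = posterior_probability(optimistic_prior, 1099)    # hypothetical likelihood ratio
print(optimistic_prior, break_even)
```

With these prior odds, the name evidence would need to favour the NT family by a factor of 1099 just to bring the posterior probability to one half.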

For the null distribution, names are drawn randomly using the name 
frequencies in Ilan. Men and women are being treated separately and the 
list of names n is treated as unordered. 

When specifying the probability distribution under the alternative, it is
necessary to weigh flexibility against complexity. Here we want to take the
following approach: Specify a set of names from the NT family (separately for
men and women) and assign each name a weight as to how likely it is to find
this person in the NT family tomb. Then, the probability of a specific tomb
is calculated by drawing from the name set without replacement according
to the weights. The weights can be determined in an optimistic or more
conservative fashion (see Table 1).

Table 1
Weights for each of the persons listed

Scenario     Jesus son of Joseph   James   Joses   Matthew   Judas      Others
Neutral      20                    3       3       62/2509   171/2509   3
Optimistic   ∞                     1       1       62/2509   171/2509

Scenario     Marya (mother)   Mariam (sister)   Salome (sister)   Mary Magdalene   Others
Neutral      10               3                 3                 3                3
Optimistic   ∞                1                 1

Drawing from the set is done with probabilities proportional to the weights, without
replacement. The weight in the "Others" category is the weight for all persons not listed.
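
Drawing without replacement with probabilities proportional to weights can be sketched as follows. This is a generic illustration of successive weighted sampling, not the authors' code: the weight dictionary is a truncated, illustrative subset of the neutral male weights, and treating "Others" as a single individual is a simplification made only for this example.

```python
from fractions import Fraction
from itertools import permutations

def prob_unordered_draw(weights, chosen):
    """Probability that successive weighted draws without replacement
    yield exactly the names in `chosen`, in any order."""
    total_prob = Fraction(0)
    for order in permutations(chosen):
        remaining = sum(weights.values())   # total weight still in the pool
        prob = Fraction(1)
        for name in order:
            prob *= Fraction(weights[name]) / remaining
            remaining -= weights[name]      # drawn name leaves the pool
        total_prob += prob
    return total_prob

# Illustrative weights only (finite values, "Others" lumped into one entry).
example_weights = {"Jesus": 20, "James": 3, "Joses": 3, "Others": 3}

p_pair = prob_unordered_draw(example_weights, ("Jesus", "James"))
p_uniform = prob_unordered_draw({"a": 1, "b": 1, "c": 1}, ("a", "b"))
print(p_pair, p_uniform)
```

Summing over orderings is what makes the list of names unordered; the enumeration grows factorially, which is acceptable for a handful of ossuaries but would need a smarter method for larger sets.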

For simplicity, the probability of being in the generational ossuary is taken
to be the same for everyone, under the null as well as the alternative.¹

Neutral scenario: In this case, we chose the weights in a fashion that 
seemed reasonable to us when we do not consider the information gathered 
from the tomb. Also, each name in the tomb is taken as its broad name 
category and it is assumed that no additional information for special name 
renditions is available for the NT family. 

Neutral with special renditions: Here, we use the same weights as in
the neutral scenario, but account for the special "Mariamenou η Mara"
rendition. None of the other inscriptions on the ossuaries is special, so we
do not make any adjustments for those. A priori, we could not have known
the inscription "Mariamenou η Mara," so how do we account for it? Under
$\theta = \mathrm{P}$, we assume that for the Marya name category, the probability of seeing
a new, previously unseen name is 1/80. For $\theta = \mathrm{NT}$, we assume that special
name renditions are more likely, say 1/10. Assuming that "Mariamenou η
Mara" could in some way be interpreted for Maria (mother), Mariam (sister)
and Maria Magdalena, this raises the odds by a factor of 8 over the neutral
scenario.
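
The effect of this factor can be traced with a short sketch that converts a probability to odds, multiplies by the ratio of the two rendition probabilities, and converts back. The starting value 0.034 is the rounded neutral posterior reported in Table 2, so the result only approximately matches the reported 21.8%; the helper names are ours.

```python
from fractions import Fraction

def to_odds(prob):
    return prob / (1 - prob)

def to_prob(odds):
    return odds / (1 + odds)

# Ratio of the rendition probabilities under theta = NT and theta = P.
rendition_factor = Fraction(1, 10) / Fraction(1, 80)

neutral_posterior = 0.034   # rounded value from Table 2
special_posterior = to_prob(to_odds(neutral_posterior) * float(rendition_factor))
print(rendition_factor, round(special_posterior, 3))
```

Because the update acts on odds rather than probabilities, an eightfold factor moves a posterior of a few percent to roughly one in five, not to eight times the original probability.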

Optimistic scenario: We also wanted to explore the effect of having very
optimistic assumptions which are to a large degree influenced by what has
been observed in the tomb. Jesus and his mother are taken to be in the tomb
for sure. For the rest of the men, the weights are equal for both brothers
and set to the normal name frequency in the population for Matthew and
Judas. The overall effect of this choice of weights is to effectively ignore the
Matthew and Judas ossuaries, assume that one of the ossuaries is from a
brother and one from a sister of Jesus and assign all eligible brothers and
sisters the same weight.

¹ This may be viewed as an oversimplification; however, as the weights provide ample
opportunity to fine-tune prior beliefs, we do not see this as practically important.

Table 2
Posterior probability that the Talpiyot tomb
belongs to the NT family under various scenarios

Scenario                          Probability
Neutral                           3.4%
Neutral with special renditions   21.8%
Very optimistic                   64.1%



4.3. Results. Even in the optimistic scenario, there is only about a 64%
chance of the tomb belonging to the NT family. In the other two, more
realistic schemes, the probability is only 22% and 3% (see Table 2). Like
Feuerverger, we did not consider the generational part of the "Judas,
son of Jesus" ossuary. Including it in the analysis would be possible; however,
as prior beliefs about a possible son of Jesus are very strong, this may have
overwhelmed the rest of the analysis and therefore we decided to exclude it.



5. Conclusion. When asked to analyze these data, we suspect that many
statisticians would have said that the problem is too vague and would have
stopped there. We commend Andrey Feuerverger for plunging in and doing
a serious analysis. Our analysis suggests that the finding does not lend
support to the hypothesis that the find is indeed the tomb of the NT family.
Ultimately, scholars of history and archeology will judge the validity of the
claims about this find.



Department of Statistics 

Stanford University 

Stanford, California 94305 

USA 

E-MAIL: hhoGfiin@stanford.edu 



Department of Statistics 

Carnegie Mellon University 

Pittsburgh, Pennsylvania 15213 

USA 

E-mail: larry@stat.cmu.edu 



